Conversation

@onmax
Contributor

@onmax onmax commented Nov 18, 2025

Improved MCP tool descriptions with structured guidance and added realistic evaluation scenarios.

Impact

Metric     | Before | After
Eval Score | 45%    | 60%

Model: gpt-5.1-codex-mini. Maybe we should try other models, such as Sonnet 4.5, which is more common among developers?

Important

For now, the dev server must be running on the same machine as the evals.
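
Because the evals depend on a running dev server, a preflight check can turn a confusing connection failure into an actionable message. The following is a minimal sketch, not code from this PR: `assertDevServerRunning`, `explainUnreachable`, and the `Probe` type are hypothetical names, and the probe is injected so the check does not depend on any particular HTTP client.

```typescript
// Minimal response shape the check relies on; a real fetch Response satisfies it.
type Probe = (url: string) => Promise<{ ok: boolean, status: number }>

// Build an actionable message instead of surfacing a bare connection error.
function explainUnreachable(baseUrl: string, reason: string): string {
  return [
    `Could not reach the dev server at ${baseUrl} (${reason}).`,
    'Start the dev server on this machine before running the evals,',
    'and check that the URL points at this project rather than another running app.',
  ].join(' ')
}

// Resolves if the server answers with a 2xx status, otherwise throws a descriptive error.
async function assertDevServerRunning(baseUrl: string, probe: Probe): Promise<void> {
  let reason: string
  try {
    const res = await probe(baseUrl)
    if (res.ok) return
    reason = `HTTP ${res.status}`
  } catch (err) {
    // Network-level failures (for example a refused connection) end up here.
    reason = err instanceof Error ? err.message : String(err)
  }
  throw new Error(explainUnreachable(baseUrl, reason))
}
```

On Node 18 or later, where `fetch` is available globally, this could be called as `await assertDevServerRunning('http://localhost:3000', url => fetch(url))` before the evals start. The URL here is an assumption; use whatever address the dev server actually listens on.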

Changes

  • Added WHEN TO USE / WHEN NOT TO USE sections to tool descriptions
  • Included concrete examples and common paths for documentation tools
  • Clarified parameter usage (slug vs name) with examples
  • Fixed prompt argsSchema to use z.object() for compatibility
  • Added realistic evaluation scenarios based on actual developer questions
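
To illustrate the structured guidance described in the list above, here is a minimal sketch of how a description with WHEN TO USE / WHEN NOT TO USE sections might be assembled. It is not the PR's actual code: the `ToolGuidance` interface, the `buildToolDescription` helper, and the example tool text and path are all hypothetical.

```typescript
// Hypothetical shape for the structured guidance; field names are illustrative only.
interface ToolGuidance {
  summary: string
  whenToUse: string[]
  whenNotToUse: string[]
  examples: string[]
}

// Render the guidance as one plain-text string, since a tool description
// is a single string that the model reads when choosing a tool.
function buildToolDescription(g: ToolGuidance): string {
  // Empty sections are omitted entirely so the description stays short.
  const section = (title: string, items: string[]): string[] =>
    items.length > 0 ? ['', `${title}:`, ...items.map(item => `- ${item}`)] : []

  return [
    g.summary,
    ...section('WHEN TO USE', g.whenToUse),
    ...section('WHEN NOT TO USE', g.whenNotToUse),
    ...section('EXAMPLES', g.examples),
  ].join('\n')
}

// Hypothetical documentation tool; the wording and path are placeholders.
const getPageDescription = buildToolDescription({
  summary: 'Retrieves the full content of one documentation page by path.',
  whenToUse: ['You already know the exact page path'],
  whenNotToUse: ['You need to search or browse: list the pages first'],
  examples: ['path: "/docs/getting-started/installation"'],
})
```

Keeping the sections in a fixed order means every tool presents its guidance the same way, which should make it easier to compare descriptions when an eval picks the wrong tool.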

Limitations

Several evaluation scenarios are commented out due to MCP prompt limitations:

  • @ai-sdk/mcp does not support converting prompts to tools yet
  • Tests requiring find_documentation_for_topic, deployment_guide, and migration_help prompts are disabled
  • These will be enabled once prompt-to-tool conversion is available
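
As an alternative to commenting scenarios out, the prompt-dependent ones could be tagged and filtered behind a single flag. This is only a sketch under assumptions: the `EvalScenario` shape, the flag name, the sample questions, and `hypothetical_docs_tool` are invented for illustration; only the three prompt names come from the list above.

```typescript
// Hypothetical scenario shape; the real eval files in the PR may look different.
interface EvalScenario {
  question: string
  expectedCall: string
  kind: 'tool' | 'prompt'
}

// Sample questions are invented; the three prompt names are the ones listed above.
const scenarios: EvalScenario[] = [
  { question: 'How do I install a module?', expectedCall: 'hypothetical_docs_tool', kind: 'tool' },
  { question: 'Where are the docs about data fetching?', expectedCall: 'find_documentation_for_topic', kind: 'prompt' },
  { question: 'How do I deploy my app?', expectedCall: 'deployment_guide', kind: 'prompt' },
  { question: 'How do I upgrade to the next major version?', expectedCall: 'migration_help', kind: 'prompt' },
]

// Flip to true once prompt-to-tool conversion is available in the client library.
const PROMPTS_AS_TOOLS_SUPPORTED: boolean = false

// A scenario can run if it targets a tool, or if prompts can be exposed as tools.
function isRunnable(s: EvalScenario): boolean {
  return s.kind === 'tool' || PROMPTS_AS_TOOLS_SUPPORTED
}

const runnable = scenarios.filter(isRunnable)
const skipped = scenarios.filter(s => !isRunnable(s))
```

With this layout the disabled scenarios stay type-checked and visible in the skipped list, and enabling them later is a one-line change.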

Related

@onmax onmax requested a review from atinux as a code owner November 18, 2025 14:16
@vercel
Contributor

vercel bot commented Nov 18, 2025

@onmax is attempting to deploy a commit to the Nuxt Team on Vercel.

A member of the Team first needs to authorize it.

@onmax
Contributor Author

onmax commented Nov 18, 2025

I'd also like to suggest updating the blog post to cover this evals solution. What do you think?

https://nuxt.com/blog/building-nuxt-mcp

@onmax onmax force-pushed the improve-mcp-tool-descriptions branch from a3b9343 to 0932366 Compare November 18, 2025 14:24
@HugoRCD HugoRCD self-requested a review November 18, 2025 14:40
@onmax onmax force-pushed the improve-mcp-tool-descriptions branch 2 times, most recently from 98af43f to 79ef06f Compare November 18, 2025 14:41
@HugoRCD
Member

HugoRCD commented Nov 18, 2025

@onmax Thank you! Give me some time to review this and familiarize myself with Evalite, and yes, depending on that, we'll probably modify the blog post. FYI, I'm also working on another MCP-related project where this could be useful! 😁

@vercel
Contributor

vercel bot commented Nov 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Preview | Comments | Updated (UTC)
nuxt    | Ready      | Preview | Comment  | Nov 26, 2025 4:57pm

@onmax onmax force-pushed the improve-mcp-tool-descriptions branch 2 times, most recently from f05b915 to 64933ac Compare November 18, 2025 14:53
@onmax
Contributor Author

onmax commented Nov 18, 2025

OK @HugoRCD.

I was also thinking of opening a PR for Nuxt UI, but I'll wait until you decide on the best approach :)

@onmax onmax force-pushed the improve-mcp-tool-descriptions branch from 64933ac to d5297d6 Compare November 18, 2025 21:47
Enhanced tool descriptions with WHEN TO USE guidance, examples, and common paths. Added realistic evaluation scenarios.
@onmax onmax force-pushed the improve-mcp-tool-descriptions branch from d5297d6 to 812bdff Compare November 18, 2025 22:02
@HugoRCD
Member

HugoRCD commented Nov 20, 2025

@onmax Do you have a particular config for running the evals? I tried with the current version and got this error:
[Screenshot: CleanShot 2025-11-20 at 13 44 14@2x]

And the same with the latest:
[Screenshot: CleanShot 2025-11-20 at 13 43 03@2x]

@onmax
Contributor Author

onmax commented Nov 20, 2025

Are you running the dev server in parallel?

Sorry, I didn't mention it in the PR!

[image]

@HugoRCD
Member

HugoRCD commented Nov 20, 2025

> Are you running the dev server in parallel?
>
> Sorry, I didn't mention it in the PR!
>
> [image]

It's pretty obvious in the end but I had too many apps running at the same time and it wasn't using the right one 😭 (in my defence the error could be a bit more obvious 😂)

@onmax
Contributor Author

onmax commented Nov 20, 2025

Yes, totally agreed, but I didn't want to write too much custom code for now :)

@atinux
Member

atinux commented Nov 25, 2025

Happy to resolve the conflicts @onmax ?

@onmax
Contributor Author

onmax commented Nov 25, 2025

Yes 👍

Should I leave the "@ts-expect-error - MCP SDK has overly strict Zod type constraints" comments in place, or should I try to fix the underlying type errors?

CI currently fails if I remove them.

@danielroe danielroe changed the title improve: MCP tool descriptions and evaluations f89ximprove: MCP tool descriptions and evaluations Nov 25, 2025
@danielroe danielroe changed the title f89ximprove: MCP tool descriptions and evaluations fix: improve MCP tool descriptions and evaluations Nov 25, 2025
@onmax onmax force-pushed the improve-mcp-tool-descriptions branch from 1ffa673 to 0073df0 Compare November 25, 2025 16:09
@onmax
Contributor Author

onmax commented Nov 26, 2025

I resolved the conflict btw @atinux

@HugoRCD
Member

HugoRCD commented Nov 26, 2025

@onmax Not a fan of having to put these any casts and @ts-expect-error comments everywhere, but I guess we don't really have a choice until Zod 4 support is added 🥲

@onmax
Contributor Author

onmax commented Nov 26, 2025

🫂 i feel you...

@atinux atinux merged commit 22d89e2 into nuxt:main Nov 27, 2025
4 of 5 checks passed
@onmax onmax deleted the improve-mcp-tool-descriptions branch December 1, 2025 11:52
3 participants